Hidden Markov Models: Applications to Flash Memory data and Hospital Arrival times

Authors

  • Tiberiu Chis
  • Peter Harrison
  • William Knottenbelt
Abstract

A hidden Markov model (HMM) is a bivariate Markov chain which encodes information about the evolution of a time series. HMMs can faithfully represent workloads for discrete time processes and can therefore be used as portable benchmarks to explain and predict the complex behaviour of these processes. This project introduces the main concepts of HMMs for discrete time series, including a summary of their mathematical properties. A section of this report explains the motives behind cluster analysis and how to select the most efficient clustering algorithm when creating workload models. In the case of this project, the benefits of the K-means clustering algorithm for data points in discrete time are explained. The main aims of this project are to: apply HMMs to two different scenarios to correctly analyse discrete time series; provide meaning to the underlying hidden states of the HMMs in each case; and recreate representative traces for each application. Firstly, the HMM is applied to Flash Memory data in the form of operation type traces to obtain a workload model. Secondly, the HMM is used to decode a data trace formed of hospital patient arrivals, creating a Hospital Arrivals model. Both of these models are validated using averages from the raw and HMM-generated traces and also by comparison of autocorrelation functions. Another aim of the project is to create a novel adaptation of the Baum-Welch algorithm using Flash Memory data. It is known that discrete HMMs can effectively learn long sequences of observations such as workload access patterns in computer storage systems. However, there is now increasing demand for systems which handle higher density, additional loads as seen in storage workload modelling [1], where workloads can be characterized on-line. Thus, we derive a sliding version of the Baum-Welch algorithm, which constantly updates its observation set, discarding old data points in the time series as it inputs new ones.
We refer to an HMM with this sliding Baum-Welch algorithm as a SlidHMM because it slides across the time series, updating its parameters "on-the-fly". The benefit of this novel approach is a parsimonious model which updates its encoded information whenever more real-time workload data becomes available. The SlidHMM is also efficient at keeping track of non-homogeneous processes because it updates the observation set at different stages of the analysis, and therefore analyses only the current portion of the time series. An efficient process for identifying the optimal number of hidden states for an HMM is also discussed, but left mostly as future work. Also reserved as extensions are: the choice of a different clustering algorithm for each model; and a new approximation for the backward variables in the Baum-Welch algorithm to seek an improvement on the SlidHMM.
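
The sliding observation set described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SlidHMM derivation: as a simplifying assumption, it re-runs a few iterations of the standard scaled Baum-Welch algorithm on each window, warm-started from the previous parameters. The function names `baum_welch_step` and `sliding_fit`, the smoothing constant `eps`, and the window and iteration parameters are all hypothetical choices made for illustration.

```python
from collections import deque
import numpy as np

def baum_welch_step(obs, pi, A, B, eps=1e-6):
    """One scaled Baum-Welch (EM) iteration for a discrete HMM.

    obs: sequence of integer symbols; pi: (N,), A: (N, N), B: (N, M).
    eps is an assumed smoothing constant that keeps probabilities non-zero,
    so symbols absent from one window can still appear in the next.
    """
    obs = np.asarray(list(obs))
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    scale = np.zeros(T)

    # Forward pass, normalising each step to avoid numerical underflow.
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, reusing the forward scaling factors.
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

    # State posteriors gamma[t, i] and transition posteriors xi[t, i, j].
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]
          / scale[1:, None, None])

    # Re-estimate the parameters, then smooth and renormalise the rows.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None] + eps
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B = new_B / gamma.sum(axis=0)[:, None] + eps
    new_A /= new_A.sum(axis=1, keepdims=True)
    new_B /= new_B.sum(axis=1, keepdims=True)
    return new_pi, new_A, new_B

def sliding_fit(stream, window_size, pi, A, B, iters=5):
    """Yield updated (pi, A, B) each time the full window slides by one point."""
    window = deque(maxlen=window_size)  # the oldest point is discarded automatically
    for x in stream:
        window.append(x)
        if len(window) == window_size:
            for _ in range(iters):
                pi, A, B = baum_welch_step(window, pi, A, B)
            yield pi, A, B

# Usage on a synthetic two-symbol trace (made-up data, e.g. 0 = read, 1 = write).
rng = np.random.default_rng(0)
trace = rng.integers(0, 2, size=60)
pi0 = np.array([0.5, 0.5])
A0 = np.array([[0.7, 0.3], [0.4, 0.6]])
B0 = np.array([[0.8, 0.2], [0.3, 0.7]])
models = list(sliding_fit(trace, 40, pi0, A0, B0))
```

Refitting from the previous parameters on every new point is the simplest way to show the idea, but it repeats most of the forward-backward work for each window. An incremental update of the kind the abstract describes would avoid that cost.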


Similar resources

Storage Workload Modelling by Hidden Markov Models: Application to FLASH Memory

A workload analysis technique is presented that processes data from operation type traces and creates a Hidden Markov Model (HMM) to represent the workload that generated those traces. The HMM can be used to create representative traces for performance models, such as simulators, avoiding the need to repeatedly acquire suitable traces. It can also be used to estimate directly the transition pro...


Introducing Busy Customer Portfolio Using Hidden Markov Model

Due to the effective role of Markov models in customer relationship management (CRM), there is a lack of comprehensive literature review which contains all related literatures. In this paper the focus is on academic databases to find all the articles that had been published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...


Infinite Hidden Semi-Markov Modulated Interaction Point Process

The correlation between events is ubiquitous and important for temporal events modelling. In many cases, the correlation exists between not only events’ emitted observations, but also their arrival times. State space models (e.g., hidden Markov model) and stochastic interaction point process models (e.g., Hawkes process) have been studied extensively yet separately for the two types of correlat...


Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...


Queueing Analysis of Markov Modulated ON/OFF Arrivals with Geometric Service Times

There has been extensive work focused on modeling discrete-time queueing systems with correlated arrivals [1], [2], [3], [4]. Specific attention has been drawn to queueing models with application to data switching, where bursty arrival streams represent more accurately real-life network traffic. To that end, the Markov modulated ON-OFF model has been frequently incorporated as a...




Publication date: 2011